Accepted Manuscript 


Detection and spike gene characterization in porcine 
deltacoronavirus in China during 2016-2018 


Yu Zhang, Yao Cheng, Gang Xing, Jing Yu, Ao Liao, Liuyang Du, 
Jing Lei, Xue Lian, Jiyong Zhou, Jinyan Gu 



PII: S1567-1348(19)30060-7
DOI: https://doi.org/10.1016/j.meegid.2019.04.023
Reference: MEEGID 3870


To appear in: Infection, Genetics and Evolution


Received date: 17 December 2018
Revised date: 8 April 2019
Accepted date: 22 April 2019


Please cite this article as: Y. Zhang, Y. Cheng, G. Xing, et al., Detection and spike
gene characterization in porcine deltacoronavirus in China during 2016-2018, Infection,
Genetics and Evolution, https://doi.org/10.1016/j.meegid.2019.04.023


This is a PDF file of an unedited manuscript that has been accepted for publication. As 
a service to our customers we are providing this early version of the manuscript. The 
manuscript will undergo copyediting, typesetting, and review of the resulting proof before 
it is published in its final form. Please note that during the production process errors may 
be discovered which could affect the content, and all legal disclaimers that apply to the 
journal pertain. 






Detection and spike gene characterization in porcine deltacoronavirus in China during 2016-2018 


Yu Zhang a,c,1, Yao Cheng a,c,1, Gang Xing b, Jing Yu a,c, Ao Liao d, Liuyang Du a,c, Jing Lei a,c, Xue Lian c, Jiyong Zhou b,*, Jinyan Gu a,b,c,*


a MOE Joint International Research Laboratory of Animal Health and Food Safety, Institute of Immunology and College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, China

b MOA Key Laboratory of Animal Virology, Department of Veterinary Medicine, Zhejiang University, Hangzhou 310058, China

c Jiangsu Engineering Laboratory of Animal Immunology, College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, China

d Ma’anshan Shiji Animal Health Management Co. Ltd, Anhui, China


* Corresponding authors. 


1 These authors contributed equally to this work. 


E-mail addresses: gjy@njau.edu.cn (J. Gu), jyzhou@zju.edu.cn (J. Zhou). 




Abstract 


Porcine deltacoronavirus (PDCoV) has been emerging in several swine-producing countries for years. In our study, 
719 porcine diarrhoea samples from 18 provinces in China were collected for PDCoV and porcine epidemic diarrhea 
virus (PEDV) detection. The epidemiological survey revealed that the positive rates of PDCoV, PEDV and 
coinfection were 13.07%, 36.72% and 4.73%, respectively. The entire spike (S) genes of eleven detected PDCoV 
strains were sequenced. Phylogenetic analysis showed that the majority of PDCoVs could be divided into three 
lineages: the China lineage, the USA/Japan/South Korea lineage and the Viet Nam/Laos/Thailand lineage. The China 
and the Viet Nam/Laos/Thailand lineages showed much greater genetic divergences than the USA/Japan/South 
Korea lineage. The present study detected one new monophyletic branch that contained three PDCoVs from China, 
and this branch was separated from the China lineage but closely related to the Viet Nam/Laos/Thailand lineage. 
The strain CH-HA2-2017, which belongs to this new branch, had a possible recombination event between positions 
27 and 1234. Significant amino acid substitutions of PDCoV S proteins were analysed and displayed on a
three-dimensional cartoon diagram. The spatial locations of these substitutions provide a conformation-based
reference for further studies on the significance of critical sites on the PDCoV S protein.

1. Introduction


Porcine deltacoronavirus (PDCoV) is a newly emerging enteropathogenic swine coronavirus that can cause acute
diarrhoea and vomiting in pigs, leading to dehydration and death in newborn piglets (Chen et al., 2015b; Jung et al.,
2015; Ma et al., 2015). The clinical symptoms caused by PDCoV are similar to but milder than those of
porcine epidemic diarrhea virus (PEDV) (Hu et al., 2015). PDCoV is an enveloped, single-stranded, positive-sense
RNA virus that belongs to the genus Deltacoronavirus within the family Coronaviridae of the order Nidovirales.
PDCoV was first discovered in Hong Kong, China, in 2012 in a territory-wide molecular epidemiology study in
mammals and birds (Woo et al., 2012). PDCoV was subsequently reported in the United States in 2014 (Castro et
al., 2012; Wang et al., 2014) and then in South Korea, mainland China, Japan, Thailand, Viet Nam and Lao People’s
Democratic Republic (Lao PDR) (Lee and Lee, 2014; Lorsirigool et al., 2016; Saeng-Chuto et al., 2017a; Song et
al., 2015). Coinfection of PDCoV with other enteric viral pathogens such as PEDV, rotavirus or kobuvirus is
commonly reported (Mai et al., 2018; Marthaler et al., 2014a; Marthaler et al., 2014b).

The genome of PDCoV is approximately 25.4 kb in length and has a genomic organization similar to that of
other coronaviruses: a 5’ untranslated region (UTR); open reading frame 1a (ORF1a) and ORF1b, encoding two
overlapping polyprotein precursors; four structural protein genes encoding spike (S), envelope (E), membrane (M)
and nucleocapsid (N); three accessory protein genes encoding NS6, NS7 and NS7a; and a 3’ UTR and poly(A) tail
(Chen et al., 2015a; Fang et al., 2017; Fang et al., 2016; Li et al., 2014; Woo et al., 2012). For coronaviruses, the
function of the S protein is to recognize receptors and mediate viral entry into host cells. The S protein is composed
of an N-terminal S1 subunit for receptor binding and a C-terminal S2 subunit for the fusion of host and viral
membranes. The cryo-electron microscopy structure of the PDCoV S protein ectodomain (S-e), without the
transmembrane anchor or intracellular tail, in the prefusion state (Shang et al., 2018) showed that the S1 subunit
contains an N-terminal domain (S1-NTD), a C-terminal domain (S1-CTD) and connecting subdomains (SDs). The
S2 subunit contains two central helices (CH-N and CH-C), a hydrophobic fusion peptide (FP), two heptad repeat
(HR-N and HR-C) regions and connecting loops. Between S1 and S2 are connecting SDs and a long loop. Because
the S protein of the coronavirus is the major surface protein and the main target of the host humoral immune
response, it is considered to be evolutionarily relevant and a focus of vaccine design (Graham et al., 2013).

Several genetic and phylogenetic analyses using S genes or complete genomes have indicated that the global PDCoV 
strains separated clearly into three lineages: the China lineage, the USA/Japan/South Korea lineage and the Viet 
Nam/Laos/Thailand lineage (Mai et al., 2018; Saeng-Chuto et al., 2017b; Suzuki et al., 2018). In the present study, 
to further investigate the epidemiology and phylogenetics of PDCoV in China, a total of 719 porcine samples from 
18 provinces in China from 2016 to 2018 were simultaneously tested for PDCoV and PEDV. The complete S genes
of 11 PDCoV strains from PDCoV-positive samples were sequenced and analysed. Phylogenetic, sequence and
recombination analyses were also performed. The results of this study will help to clarify the prevalence of
PDCoV strains in China and provide further insight into the evolution and diversity of PDCoVs.

2. Materials and methods 

2.1 Sample collection and molecular detection 

A total of 719 porcine samples, including faeces, faecal swabs or small intestines, were collected from sows, boars,
finishers or nursing piglets showing signs of diarrhoea on different commercial pig farms in China over a 27-month
period (March 2016-June 2018). The 18 sampling provinces are shown in Fig. 1A. All of the samples were placed
into separate clean containers with phosphate-buffered saline, frozen and thawed three times, and then centrifuged
for 10 min at 845 × g. The supernatants from the 719 samples were mixed with TRIzol for viral RNA extraction. Total
RNA was dissolved in RNase-free water and stored at -70 °C until use. Synthesis of cDNA
for each sample was carried out using a RevertAid First Strand cDNA Synthesis Kit (Thermo Scientific, USA)
according to the manufacturer’s instructions. The cDNAs were screened for the presence of PDCoV and PEDV.
PDCoV was detected by real-time reverse transcription PCR (rRT-PCR) targeting the membrane (M) gene as
reported previously (Marthaler et al., 2014b), using AceQ qPCR Probe Master Mix (Vazyme, China) on a
LightCycler 96 instrument. The cycling conditions were as follows: 5 min at 95 °C, followed by 40 cycles of
15 s at 95 °C and 30 s at 60 °C. The primers and procedure used for detecting PEDV were as previously reported
(Chiou et al., 2017).

2.2 Sequencing of the complete S gene of PDCoV 

To obtain the complete sequence of the PDCoV S gene, three previously established primer pairs were
synthesized to amplify three overlapping DNA fragments spanning the entire S gene (Wang et al., 2014). Phanta HS
Super-Fidelity DNA Polymerase (Vazyme, China) was used for amplification, and the PCR products were
sequenced (Sangon Biotech, China) and assembled into full-length S gene sequences using DNAMAN software.
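
The assembly of overlapping fragments into one full-length sequence can be illustrated with a minimal sketch. It is not the algorithm used by DNAMAN; it assumes error-free fragments supplied in 5' to 3' order that overlap exactly at their ends, and the function names (`merge_pair`, `assemble`) and the toy sequence are hypothetical.

```python
def merge_pair(left: str, right: str, min_overlap: int = 20) -> str:
    """Join two sequences whose ends overlap exactly, keeping the overlap once."""
    longest = min(len(left), len(right))
    # Try the longest candidate overlap first, then shorter ones.
    for size in range(longest, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    raise ValueError(f"no exact overlap of at least {min_overlap} nt found")


def assemble(fragments: list[str], min_overlap: int = 20) -> str:
    """Merge fragments given in 5' to 3' order into one contig."""
    if not fragments:
        raise ValueError("no fragments supplied")
    contig = fragments[0]
    for fragment in fragments[1:]:
        contig = merge_pair(contig, fragment, min_overlap)
    return contig


# Toy 40-nt sequence (hypothetical, not PDCoV data) cut into three
# pieces that overlap their neighbours by 6 nt.
toy_gene = "ATGCAGAGAGCTCTATTGATTATGACCTTACTTTGTCTCG"
toy_fragments = [toy_gene[:18], toy_gene[12:30], toy_gene[24:]]
assembled = assemble(toy_fragments, min_overlap=6)
print(assembled == toy_gene)
```

Real Sanger reads contain base-calling errors near their ends, so production assemblers score approximate overlaps rather than requiring exact matches as this sketch does.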

2.3 Sequence analysis 

Nucleotide and deduced amino acid sequences of the complete S gene of PDCoVs were aligned with the Clustal W
program. Phylogenetic trees were constructed using the maximum likelihood method in Molecular Evolutionary
Genetics Analysis (MEGA) software (version 7.0) (http://www.megasoftware.net/), and bootstrap values were
estimated from 1000 replicates. To characterize the genetic divergence within and between the lineages, the distances
within and between lineages were calculated with the Tamura-Nei model, and standard errors were estimated from
1000 bootstrap replicates. The significant substitutions were displayed on a three-dimensional cartoon diagram of
PDCoV S-e using PyMOL software; the cryo-electron microscopy structure of prefusion PDCoV S-e was
downloaded from the Protein Data Bank (PDB) (http://www.rcsb.org/) under entry 6B7N. Putative recombination
events among PDCoV strains were predicted using the Recombination Detection Program version 4.0
(RDP4) package with default settings (Martin et al., 2015). Six recombination detection methods (RDP,
Chimaera, BootScan, GENECONV, MaxChi and SiScan) were used to analyse the sequences. The criterion
for determining recombination and breakpoints was a P-value < 0.05, and only putative recombination events
detected by more than three methods were accepted.
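
The acceptance rule stated above (P < 0.05 and support from more than three of the six methods) can be written as a small filter. This is only a sketch of the decision rule, not part of RDP4; the function names and the example P-values are hypothetical, and a method that reports no signal is represented by `None`.

```python
from typing import Dict, List, Optional

METHODS = ("RDP", "Chimaera", "BootScan", "GENECONV", "MaxChi", "SiScan")


def supporting_methods(p_values: Dict[str, Optional[float]],
                       alpha: float = 0.05) -> List[str]:
    """Return the methods whose P-value is below alpha."""
    supported = []
    for method in METHODS:
        p = p_values.get(method)  # None means the method gave no signal
        if p is not None and p < alpha:
            supported.append(method)
    return supported


def accept_event(p_values: Dict[str, Optional[float]],
                 alpha: float = 0.05, min_methods: int = 4) -> bool:
    """Accept an event only if more than three methods support it."""
    return len(supporting_methods(p_values, alpha)) >= min_methods


# Hypothetical P-values for one candidate event (not results from this study).
example_event = {
    "RDP": 1e-4, "Chimaera": 0.003, "BootScan": 0.01,
    "GENECONV": None, "MaxChi": 0.002, "SiScan": 0.04,
}
print(supporting_methods(example_event))
print(accept_event(example_event))
```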

2.4 GenBank accession numbers for the PDCoV S gene 

Eleven PDCoV S genes sequenced in this study have been deposited in GenBank under the following accession
numbers: MK040445-MK040455.

3. Results 

3.1. PCR detection and sequence properties of the PDCoV S genes 

In total, 719 porcine samples collected from 18 provinces of China were tested for PDCoV and PEDV infection;
94 samples were PDCoV positive (13.07%), 267 samples were PEDV positive (36.72%), and
34 samples were positive for both PDCoV and PEDV, yielding a coinfection rate of 4.73%. Eleven diarrhoea samples
from 4 provinces of China (Fig. 1B) that yielded low cycle threshold values (lower than 25) in the rRT-PCR assay
were chosen for sequencing of the entire PDCoV S gene, and the full-length S gene sequences were submitted to
the GenBank database (Table 1). Of the 11 PDCoV strains, 9 contained a 3480-nt S gene with a 3-nt deletion (AAT,
positions 154 to 156), which is common in the majority of PDCoV strains in China. The other 2 strains (CH-HA1-2017
and CH-HA3-2017) had a 3483-nt S gene, which is common in the USA/Japan/South Korea lineage and the
Viet Nam/Laos/Thailand lineage.
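
The positive rates given at the start of this subsection are simple proportions of the 719 samples tested. As a brief worked check, shown here for the PDCoV-positive and coinfected counts, the helper below (a hypothetical name, `positive_rate`) converts a count into a percentage rounded to two decimals.

```python
def positive_rate(positives: int, total: int) -> float:
    """Return the positive rate as a percentage rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * positives / total, 2)


TOTAL_SAMPLES = 719  # samples tested in this study

pdcov_rate = positive_rate(94, TOTAL_SAMPLES)        # PDCoV-positive samples
coinfection_rate = positive_rate(34, TOTAL_SAMPLES)  # PDCoV and PEDV copositive
print(pdcov_rate, coinfection_rate)  # 13.07 4.73
```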

3.2. Phylogenetic analysis and genetic divergence of the PDCoV S gene 


The phylogenetic and genetic divergence analyses of the 11 sequenced S genes were performed together with 51 other
PDCoV sequences available in GenBank (Table S1). A phylogenetic tree was constructed using the maximum
likelihood method. As shown in Fig. 2, most of the global PDCoV strains could be divided into three lineages: the
China lineage, the USA/Japan/South Korea lineage and the Viet Nam/Laos/Thailand lineage. Of the 11 sequenced
strains, strain CH-WH was notably separated from all lineages and showed a closer relationship
to the China and USA/Japan/South Korea lineage strains than to the Viet Nam/Laos/Thailand lineage strains.
Another 3 strains (CH-HA1, CH-HA2 and CH-HA3) were grouped into a new monophyletic branch separate from
the China lineage, with the closest relationship to the Viet Nam/Laos/Thailand lineage strains. The other 7 strains
(CH-DH1, CH-HX, CH-XS, CH-CZ1, CH-CZ2, CH-DH2 and CH-FY) were grouped into the China lineage,
as expected.

To analyse the global genetic divergence of PDCoV strains, the genetic distances within and between the China,
USA/Japan/South Korea and Viet Nam/Laos/Thailand lineages were calculated using the Tamura-Nei model (Table 2).
Of these three lineages, the China lineage showed the largest genetic divergence, with an average distance
(± standard error) of 0.019 ± 0.001; the Viet Nam/Laos/Thailand lineage showed a slightly smaller divergence of
0.018 ± 0.002; and the USA/Japan/South Korea lineage showed the smallest divergence of 0.002 ± 0.000, which was
roughly one-tenth that of the China lineage. Between lineages, the genetic divergence between the China lineage
and the Viet Nam/Laos/Thailand lineage was 0.037 ± 0.003, that between the USA/Japan/South Korea lineage and the
Viet Nam/Laos/Thailand lineage was 0.036 ± 0.003, and that between the China lineage and the USA/Japan/South
Korea lineage was 0.017 ± 0.002. To further analyse the genetic divergence of the 11 sequenced China PDCoV
strains, the distances of the previously reported China strains and of the strains sequenced here were calculated
separately. Compared with the reported China strains, the 11 sequenced strains showed a larger genetic
divergence (0.021 > 0.016); additionally, the divergence between the sequenced China strains and the
USA/Japan/South Korea lineage was greater than that between the reported China strains and the USA/Japan/South
Korea lineage (0.021 > 0.016).
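
How a mean within-lineage and between-lineage distance is obtained from pairwise comparisons can be sketched as follows. For brevity the sketch uses the uncorrected p-distance (proportion of differing sites) instead of the Tamura-Nei model applied in MEGA7 for Table 2, and it omits the bootstrap standard errors; the function names and toy sequences are hypothetical.

```python
from itertools import combinations, product


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences (gaps skipped)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_within(group: list[str]) -> float:
    """Average distance over all pairs of sequences inside one group."""
    distances = [p_distance(a, b) for a, b in combinations(group, 2)]
    if not distances:
        raise ValueError("need at least two sequences")
    return sum(distances) / len(distances)


def mean_between(group1: list[str], group2: list[str]) -> float:
    """Average distance over all pairs drawn one from each group."""
    distances = [p_distance(a, b) for a, b in product(group1, group2)]
    if not distances:
        raise ValueError("both groups must be non-empty")
    return sum(distances) / len(distances)


# Toy aligned sequences (hypothetical, not PDCoV data).
lineage_x = ["ACGTACGTAC", "ACGTACGTAT"]
lineage_y = ["ACGAACGTTC", "ACGAACGTTT"]
print(mean_within(lineage_x), mean_between(lineage_x, lineage_y))
```

The Tamura-Nei model additionally corrects for multiple substitutions at the same site and for unequal base frequencies and transition rates, so its estimates are generally at least as large as the p-distance.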


3.3 Comparative analysis of the deduced amino acid sequence of the S protein 

To analyse the genetic characteristics of the 11 sequenced China strains, their deduced S protein amino acid
sequences were aligned and compared with those of representative reference PDCoV strains selected from
every subbranch of the different lineages in the phylogenetic tree, including strains from the China lineage
(HKU15-44, CHN-AH-2004, CHN-HN-2014, CHN/GS/2016/1, CHN-GD16-05, CHN-JS-2014, CHN-GD-2016,
HKU15-S582N, CHN-HG-2017), the USA/Japan/South Korea lineage (KNU14-04, IWT/JPN/2014,
USA/Iowa136/2015, OH-FD22, USA/Minnesota442/2014, USA/Indiana453/2014, USA/Illinois121/2014) and the
Viet Nam/Laos/Thailand lineage (Vietnam/Binh21/2015, 2016/Lao, Thailand/S5011/2015). The significant
substitutions between lineages comprised 21 amino acid substitutions in the S1 subunit and 17 in the S2 subunit,
as shown in Fig. 3A. Of the 11 sequenced PDCoVs, 3 strains (CH-HA1-2017, CH-HA2-2017 and CH-HA3-2017),
which were grouped into a new monophyletic branch in the phylogenetic tree, shared
substitutions with the Viet Nam/Laos/Thailand strains at positions 14A/V, 136I/T, 140R/H, 229Q/H, 431G/D, 571I/V
and 670V/L. Among these 3 strains, CH-HA1-2017 and CH-HA3-2017 had another 4 unique substitutions at
positions 22V/L, 43T/S, 307I/V and 557H/Q, and they carried the amino acid asparagine (N) at position 51, which is
missing in the majority of China strains. In addition, another 3 sequenced strains, CH-DH1-2017, CH-HX-2018 and
CH-XS-2018, were grouped into a single branch with 4 unique amino acid substitutions at positions 630L/A,
639D/N, 819S/T and 867I/T in the S2 subunit. The substitutions at positions 38L, 40R and 94F were relatively
conserved in most of the China strains.
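
The comparison against a reference strain can be sketched as a scan over two aligned protein sequences that reports each differing site as position, residue A, slash, residue B (for example 14A/V). The sketch assumes that the first residue is that of the reference and the second that of the compared strain, and that positions are numbered along the reference; both are assumptions, and the function name and toy peptides are hypothetical.

```python
def list_substitutions(reference: str, query: str) -> list[str]:
    """List substitutions of an aligned query relative to an aligned reference."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    position = 0  # residue number along the reference, 1-based
    for ref_aa, query_aa in zip(reference, query):
        if ref_aa == "-":
            continue  # insertion in the query: no reference position
        position += 1
        if query_aa == "-":
            continue  # deletion in the query: not a substitution
        if query_aa != ref_aa:
            substitutions.append(f"{position}{ref_aa}/{query_aa}")
    return substitutions


# Toy aligned peptides (hypothetical, not PDCoV sequences).
reference_seq = "MQRALLIMTLLCL"
query_seq = "MQRVLLIMSLLCL"
print(list_substitutions(reference_seq, query_seq))  # ['4A/V', '9T/S']
```

Substitutions shared by two strains can then be found by intersecting the sets returned for each strain.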

Using the cryo-electron microscopy structure of PDCoV S-e (residues 52-1017) (Shang et al., 2018), these significant
amino acid substitutions were displayed on a three-dimensional cartoon diagram to analyse their possible
significance (Fig. 3B). For the S1 ectodomain (residues 52-552), except for the substitution at residue 307, which
mapped to the S1-CTD, the 11 substitutions mapped to the S1-NTD and the 3 substitutions mapped to the SDs were all
located on surface loops. Residue 307, located on a β-sheet of the S1-CTD, was exposed on the surface of the S trimer.
For the S2 ectodomain (residues 553-1017), 2 substitutions were located on the loop of the FP; 6 substitutions were
located on the helices of the CH and HR regions, with 3 on CH-N (residues 624, 630 and 639), 2 on HR-N and 1 on
CH-C (residue 867); 2 substitutions (residues 557 and 571) were located on SDs; and 5 substitutions (residues 642,
666, 668, 670 and 907) were located on connecting loops.
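
Highlighting such residues on a structure in PyMOL requires a selection expression. The sketch below only builds that expression from a list of residue numbers, so it runs without PyMOL installed; the helper name `pymol_selection` is hypothetical, and it is an assumption that the residue numbering of the PDB entry matches the numbering used in the text.

```python
from typing import Iterable, Optional


def pymol_selection(residues: Iterable[int], chain: Optional[str] = None) -> str:
    """Build a PyMOL selection string such as 'resi 307+557 and chain A'."""
    unique = sorted(set(residues))
    if not unique:
        raise ValueError("no residues supplied")
    expression = "resi " + "+".join(str(r) for r in unique)
    if chain:
        expression += f" and chain {chain}"
    return expression


# Residue numbers named in the text for the S1-CTD and the S2 ectodomain.
s1_ctd_site = [307]
s2_sites = [557, 571, 624, 630, 639, 642, 666, 668, 670, 867, 907]
selection = pymol_selection(s1_ctd_site + s2_sites)
print(selection)

# Inside a PyMOL session the string could then be used, for example:
#   from pymol import cmd
#   cmd.fetch("6b7n")
#   cmd.show("cartoon")
#   cmd.select("substitutions", selection)
#   cmd.show("spheres", "substitutions")
```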

To further analyse the possible recombinant events of these 11 PDCoV strains sequenced in this study, alignments 
of S genes of the 11 PDCoVs along with reference sequences were analysed by six methods included in RDP4. A 
significant (P < 0.05) recombination event was detected by five methods (RDP, Chimaera, BootScan, MaxChi and 
SiScan) in the CH-HA2-2017 S gene between positions 27 and 1234, with CH-HA3-2017 as the major parent and 
HKU15-155 as the minor parent (Fig. 4). 

4. Discussion 

Since PDCoV was discovered in Hong Kong, China, in 2012, epidemiological investigations of PDCoV and other
relevant enteric viruses have been reported frequently. Previous studies revealed that PEDV, a very common
coinfecting pathogen with PDCoV, was more prevalent and more pathogenic than PDCoV in pig diarrhoeal samples
(Jang et al., 2017; Mai et al., 2018; Marthaler et al., 2014b; Song et al., 2015). In the present study, the positive rate
of PEDV (36.72%) was higher than that of PDCoV (13.07%) in the 719 porcine diarrhoeal samples, and this
prevalence was broadly consistent with that of other studies (Ajayi et al., 2018; Hsu et al., 2018; Song et al., 2015;
Wang et al., 2018).


In the phylogenetic analysis, most of the global PDCoV strains could be divided into three lineages: the China
lineage, the USA/Japan/South Korea lineage and the Viet Nam/Laos/Thailand lineage; this is consistent with
previous reports (Lorsirigool et al., 2016; Saeng-Chuto et al., 2017b). Three sequenced strains (CH-HA1, CH-HA2
and CH-HA3), which formed a new monophyletic branch, were most closely related to the Viet Nam/Laos/Thailand
lineage strains. This is the first report of China PDCoV strains that are so closely related to the Viet
Nam/Laos/Thailand lineage and separated from the China lineage; strains collected from China normally cluster
within the China lineage (Dong et al., 2016; Liu et al., 2018). Do the strains in this new monophyletic branch
originate from recombination or evolution? The recombination analysis showed that strain CH-HA2-2017 in this
branch indeed had a possible recombination event. The minor parent HKU15-155 was from the China lineage,
and the major parent CH-HA3-2017 was from the same new branch rather than the Viet Nam/Laos/Thailand lineage.
Is it possible that the major parent is also a recombinant? This question requires further study once enough PDCoV
sequences are available. However, the global genetic divergence analysis of PDCoV showed that the distance
between the China and Viet Nam/Laos/Thailand lineages (0.037 ± 0.003) was greater than that between the China and
USA/Japan/South Korea lineages (0.017 ± 0.001). This result makes the close relationship between China’s new
branch and the Viet Nam/Laos/Thailand lineage difficult to explain; whether these three lineages originated from the
same or different ancestors needs to be analysed in depth with more sequences. Furthermore, the greater genetic
divergence (0.036 and 0.037) of the Viet Nam/Laos/Thailand lineage from the other two lineages suggests that the
Viet Nam/Laos/Thailand strains may have diverged early from the other lineages.

Because it has a non-swine ancestor, PDCoV may not yet be fully adapted to pigs, and it appears to continue to
undergo genetic drift to become more adapted to pigs, even if pigs are considered the initial susceptible hosts (Jung
et al., 2017). The S protein of the coronavirus is the main determinant of viral host range and tissue tropism; thus,
substitutions in the S protein are critical for analysing the evolution, infectivity and pathogenicity of PDCoV. In the
present study, significant amino acid substitutions of the S protein among global PDCoV strains were analysed.
Although the significance of these substitutions remains largely unclear, the three-dimensional cartoon diagram
shows their spatial locations and makes it possible to infer their potential functions. The substitutions
mapped to the surface loops of the S1-NTD and SDs may be associated with the connection of different S1 and S2
subunits to form the crown-like structure, and residue 307, located on a β-sheet of the S1-CTD, was exposed on the
surface of the S trimer and may contribute to the antigenicity of the S protein. Thus, all of the substitutions in S1 were
located on the surface of the “crown” and may be associated with receptor-binding capacity and antigenicity.
Similarly, substitutions in S2 may affect the membrane fusion properties of the virus. A previous study
revealed that the non-synonymous substitutions L107Q, A698S, A551V, L670I, and IllIV, which were also
analysed in this study, were shared by the branches leading to Korean PDCoV isolates in 2014 and 2015 in the
reconstruction of ancestral amino acid changes (Lee et al., 2016), further implying the significance of substitutions
related to the ongoing potential adaptation to the natural host.


In summary, PDCoV strains circulating on pig farms in China may not be evolutionarily separate. A new monophyletic
branch of China strains closely related to the Viet Nam/Laos/Thailand lineage was detected in this study, although 
the ancestor or source has not been elucidated due to the limitation of available PDCoV sequences. The sites and 
locations of significant amino acid substitutions in the PDCoV S protein were analysed, but the exact biological 
functions need more experiments to be elucidated. Our study provides useful insights into the molecular 
characteristics of prevalent China PDCoV strains and provides references for further biological research on potential 
functional sites of the S protein and pathogenicity studies. Moreover, further analysis of molecular epidemiology 
based on the complete genome sequence is urgently needed. 

Table 1. Information on the 11 China PDCoV strains sequenced in this study.

No.  Strain        Farm  Sample origin  Coinfected with PEDV  Geographical origin  Collection time  Length of the S gene (nt)  Accession No.
1    CH-CZ1-2017   GC    Fecal          +                     Anhui/Chizhou        9-Feb-2017       3480                       MK040445
2    CH-CZ2-2017   GC    Fecal          +                     Anhui/Chizhou        29-Mar-2017      3480                       MK040446
3    CH-FY-2017    SM    Fecal          -                     Anhui/Fuyang         13-Mar-2017      3480                       MK040447
4    CH-HX-2018    HX    Fecal          -                     Anhui/Hexian         25-Jan-2018      3480                       MK040448
5    CH-DH1-2017   LY    Fecal          -                     Guangxi/Dahua        31-Dec-2017      3480                       MK040449
6    CH-DH2-2017   LY    Intestine      +                     Guangxi/Dahua        10-Apr-2017      3480                       MK040450
7    CH-WH-2017    JX    Fecal          -                     Hubei/Wuhan          1-May-2017       3480                       MK040451
8    CH-XS-2018    XS    Fecal          +                     Hubei/Xishui         6-Jan-2018       3480                       MK040452
9    CH-HA1-2017   HHT2  Fecal          -                     Jiangsu/Huai’an      17-Dec-2017      3483                       MK040453
10   CH-HA2-2017   HHT1  Fecal          -                     Jiangsu/Huai’an      27-Dec-2017      3480                       MK040454
11   CH-HA3-2017   HHT2  Fecal          -                     Jiangsu/Huai’an      8-Dec-2017       3483                       MK040455




Table 2. The average genetic distances within and between PDCoV lineages, which include S gene sequences of 25
China-reported PDCoVs, 20 USA/Japan/South Korea PDCoVs, 6 Viet Nam/Laos/Thailand PDCoVs and the 11
China PDCoVs sequenced in this study. The numbers of base differences per site from averaging total sequence pairs
between groups are shown. Standard error (SE) estimates are shown and were obtained by a bootstrap procedure
(1000 replicates); evolutionary analyses were conducted under the Tamura-Nei model in MEGA7.

Lineages                 Distance within lineages   Distance between PDCoV lineages
                         (mean ± S.E.)              China a         USA/Japan/South Korea   Viet Nam/Laos/Thailand   China b
China a                  0.019 ± 0.001
USA/Japan/South Korea    0.002 ± 0.000              0.017 ± 0.001
Viet Nam/Laos/Thailand   0.018 ± 0.002              0.037 ± 0.003   0.036 ± 0.003
China b                  0.016 ± 0.001              -               0.016 ± 0.001           0.037 ± 0.003
China c                  0.021 ± 0.002              -               0.021 ± 0.002           0.036 ± 0.003            0.021 ± 0.001

a China PDCoVs, including both the reported strains and those sequenced in this study.
b China-reported PDCoVs.
c China PDCoVs sequenced in this study.




Figure legends 


Fig. 1 Map of provinces representing the locations of sample collection and sequenced PDCoV S genes. (A) 18 
provinces coloured in green represent the collection sites of 719 porcine samples. (B) Four provinces coloured in 
pink represent the locations of PDCoV-positive diarrhoea samples used for S gene sequencing. 

Fig. 2 Phylogenetic analysis of the S gene of PDCoV. The tree was constructed using the maximum likelihood 
method in the MEGA V.7.0 program. Numbers at nodes represent the percentages of 1000 bootstrap replicates 
(values<50 are not shown). The scale bar indicates the number of nucleotide substitutions per site. The 11 S genes 
sequenced in this study are indicated with “red dots”. The reference sequences obtained from GenBank are indicated 
by strain name and accession number. 

Fig. 3 Analysis of the deduced amino acid sequences of PDCoV S proteins. (A) The significant substitutions among
PDCoV S proteins mapped onto a schematic drawing of the PDCoV S protein. S1: receptor-binding subunit,
S2: membrane fusion subunit, S1-NTD: N-terminal domain of S1, S1-CTD: C-terminal domain of S1, CH-N and
CH-C: central helices N and C, FP: fusion peptide, HR-N and HR-C: heptad repeats N and C, SDs: subdomains. The
dots represent amino acids that are identical to those of strain HKU15-44. The red bars indicate the precise positions
of deletions in the S proteins. (B) All significant substitutions displayed on the three-dimensional cartoon diagram of
PDCoV S-e using PyMOL software.

Fig. 4 Recombination analysis by screening multiple sequence alignments of the PDCoV S gene with the
Recombination Detection Program (RDP). The pairwise identities of the potential recombinant CH-HA2-2017, the
major parent CH-HA3-2017 and the minor parent HKU15-155 determine the potential recombinant region with a
95% confidence interval, with recombination located at nt 27-1234 of the S gene. The bold dashes on top indicate
the positions of informative sites, which are not identical or different in all three sequences.

Highlights:


• The majority of global PDCoVs could be divided into three lineages. 


• Three sequenced PDCoVs from China were closely related to the Viet Nam/Laos/Thailand lineage. 


• The China lineage showed the greatest genetic divergence. 


• One sequenced strain CH-HA2-2017 had a possible recombination event. 